The Kalashnikov 691 Dependency Bank

Author

  • Tomas By
Abstract

The PARC 700 dependency bank has a number of features that would seem to make it less than optimally suited for its intended purpose, parser evaluation. However, it is difficult to know precisely what impact these problems have on the evaluation results, and as a first step towards making comparison possible, a subset of the same sentences is presented here, marked up using a different format that avoids them. In this new representation, the tokens contain exactly the same sequence of characters as the original text, word order is encoded explicitly, and there is no artificial distinction between full tokens and attribute tokens. There is also a clear division between word tokens and empty nodes, and the token attributes are stored together with the word, instead of being spread out individually in the file. A standard programming language syntax is used for the data, so there is little room for markup errors. Finally, the dependency links are closer to standard grammatical terms, which presumably makes it easier to understand what they mean and to convert any particular parser output format to the Kalashnikov 691 representation. The data is provided both in machine-readable format and as graphical dependency trees.

"All that is complicated is unnecessary; all that is necessary is simple." (Michail Kalashnikov)

Publication date: 2008